According to the results of the theoretical conformation analysis and available experimental data, the known immunocytokines can be divided into two groups: α-helical (IFNs-α, β, γ, ω; IL-2, 3, 4, 5, 6, 7; G-, M-, GM-CSFs; cMGF; PDGF) and β-pleated proteins (IL-1α, β; TNFs-α, β). IFNs-α, β, γ, ω, IL-6, G-CSF and cMGF were shown to form a family of α-helical globular proteins characterized by a statistically significant homology in amino acid sequences and by common features of the secondary structure formation. Comparison of the sequences of 72 IFNs-α, β, ω reveals three clusters of conservative amino acid positions. Their participation in the formation of active sites of IFN-α, β, ω is supposed.
Introduction



acid 



their three-dimensional

sion of a common evolutionary ancestor and possible functional relationship of distantly related proteins ought to be derived, first, from the secondary structure homology and only then from the homology of amino acid sequences [1], which could be absent altogether [2]. Previously, on the basis of the secondary structure homology and functional similarity of IFNs-α, β and γ it was concluded that these proteins had a common evolutionary origin [3,4], though the cursory comparison of the amino acid sequence of IFN-γ with those of IFNs-α and β did not reveal any statistically valid homology [5]. Using our method of estimation of statistical validity of homology in the structure of hydrophobic cores of highly α-helical globular proteins, evidence was obtained



Abbreviations: Hu, human; Mu, murine; IFN, interferon; IL, interleukin; CSF, colony-stimulating factor; BSF, B-cell stimulating factor; PDGF, platelet-derived growth factor; p28sis, transforming protein of simian sarcoma virus; PTM-α1, prothymosin-α1; cMGF, chicken myelomonocytic growth factor; M, macrophage; G, granulocytic.



that IL-2, IL-3, PDGF and p28sis ought to have the three-dimensional structure similar, in general features, [...] we suppose that [...] and GFs form a [...] structural similarity and [...]

The cDNAs and genes of several new ILs and CSFs have been cloned by now, and their nucleotide sequences have been translated into the amino acid sequences of the proteins they encode. In a number of cases, a primary structure homology between IFNs (on the one hand) and ILs, CSFs (on the other hand) was revealed (e.g., between IFN-β2 (BSF-2), G-CSF and cMGF [9,10], and between IL-5 and IFN-γ [11]). Besides, IFN-β2 displays functional activity of the ILs and is also known as IL-6 [12].

In the present work we aimed to study, by the methods of theoretical conformational analysis, the secondary structures of IFNs, ILs and CSFs in order to obtain additional information on possible structural and evolutionary relationships between them and, on the basis of the analysis of conservative positions and homology in the protein amino acid sequences, to localize probable active sites.

Methods 



The computer program based on the method of estimation of protein secondary structures from their amino acid composition [13] was used in our work. The method employs the results of correlation and regression analyses of interrelation between the content of two types of secondary structure (α-helices and β-sheets) and amino acid composition of 80 proteins. The error of percent estimation of α-helices and β-sheets by means of the method does not exceed ±7 and 10%, respectively, as compared to the data of X-ray analysis obtained for 28 proteins [14].

To predict the localization of secondary structures we used the methods of Zav'yalov [15], Novotny and Auffray [16], Efimov [17], Lim [18], Ptitsyn and Finkelstein [19], Chou and Fasman [20] and Garnier [21]. Statistical validity of primary structure homologies was evaluated according to the method of Finkelstein [22].

Domain boundaries were determined on the basis of the algorithm of Vonderviszt and Simon [23]. The method is based on statistical data on the preferential occurrence of amino acid residues at the n and n + 1, n and n + 2 positions of the polypeptide chain. Domain boundaries correlate with the minima of the preference profiles.
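
As a rough illustration of the final step only (locating minima of a profile), a minimal Python sketch is given here. The function name `profile_minima`, the `window` parameter and the example profile values are hypothetical; the statistical preference tables of Ref. [23] are not reproduced, so the sketch assumes that a numeric preference profile has already been computed.

```python
def profile_minima(profile, window=3):
    """Return 1-based positions that are strict local minima of a profile.

    A position counts as a minimum if its value is lower than every other
    value within +/- `window` residues. Chain ends are skipped.
    NOTE: illustrative assumption, not the published algorithm of Ref. [23].
    """
    values = list(profile)
    minima = []
    for i in range(1, len(values) - 1):
        lo = max(0, i - window)
        hi = min(len(values), i + window + 1)
        neighbours = values[lo:i] + values[i + 1:hi]
        if all(values[i] < x for x in neighbours):
            minima.append(i + 1)
    return minima


# Hypothetical profile with two dips (positions 3 and 7).
example_profile = [5, 4, 3, 4, 5, 6, 2, 6, 7]
candidate_boundaries = profile_minima(example_profile, window=2)
```

In such a sketch a larger `window` suppresses shallow, closely spaced dips, which is one simple way to keep only pronounced candidate boundaries.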

Results 

Table I (columns 3, 4) shows the results of estimation of α-helices and β-sheets for IFNs-α, β, ω and γ, ILs-2, 3, 4, 5, 6, 7, CSFs (M-CSF, G-CSF and GM-CSF) and cMGF. To facilitate the analysis, the data are presented for human proteins (except cMGF); the calculations were also carried out for analogous proteins of the other animal species and yielded similar results (the data are not presented). As seen from Table I, the content of α-helices in the proteins under analysis is predicted within the range of 55-75%, whereas the content of β-sheets varies from 0 to 31%. The experimental data available (columns 5, 6) agree well with the results of estimation from amino acid composition (except for the evaluation of β-sheets in GM-CSF). Thus, the proteins listed in Table I share a common feature, a fairly high extent of α-helicity. They possess, except for IL-3, four exons (column 8) in the part of the gene corresponding to the mature protein lacking a leader sequence. For M-CSF, the number of exons is indicated for the N-terminal part of the mature protein formed due to the putative proteolytic cleavage of the membrane-linked precursor [40].

The second logical step of the analysis requires information on localization of secondary structure segments of the polypeptide chain. As an example of analysis of immunocytokine secondary structure by different methods, we present the results obtained for IFN-αA.

Fig. 1 shows the results of prediction of secondary structure and domain borders for IFN-αA by means of different methods [15-21]. All the methods predict a considerable amount of α-helices and an insignificant quantity of β-sheets. If one identifies secondary structure sites predicted by all the methods used, this would yield five segments of α-helices and none of β-sheets. All the turns are found in the regions between the helical segments. According to the method of Vonderviszt and Simon [23] one can identify [...] 100-120 residues. Analogous calculations were performed by us for the other types of IFN-α, a number of IFN-β and IFN-ω (the results are not shown). Since



TABLE I

Some features of α-helical proteins, immunocytokines

No.  Protein name  Theoretical      Experimental determination            Quantity of amino   Quantity of exons
                   determination    of content                            acid residues in    in a mature
                   of content                                             the mature protein  protein gene
                   α%     β%        α%                 β%

1    Hu IFN-αA     66     15        45-70 (CD) [24]                       165 [29]
2    Hu IFN-β      63     9         70 (CD) [27]                          166 [30]
3    Hu IFN-γ      64     13                                              146 [31,32]         4 [31,32]
4    Hu IFN-ω      68     16                                              172 [33]
5    Hu IL-2       58     28        46-65 (CD) [25]    23-25 (CD) [25]    133 [34]            4 [34]
                                    65 (XR) [26]
6    Hu IL-3       54     7                                               133 [35]            5 [35]
7    Hu IL-4       61     31                                              129 [36]
8    Hu IL-5       56     19                                              112 [37]
9    Hu IL-6       67     6                                               184 [38]            4 [38]
10   Hu IL-7       72     16                                              152 [39]
11   Hu M-CSF      55     1                                               165 [40] *          4 [41]
12   Hu G-CSF      73     27        66 (CD) [54]       ... (CD) [54]      177 [42]            4 [42]
13   Hu GM-CSF     55     0         47 (CD) [28]       46 (CD) [28]       127 [43]            4 [43]
14   cMGF          71     7                                               178 [10]

* Explanations are given in the text.
they practically coincided with the data for IFN-αA, we shall consider the results of secondary structure predictions presented at the top of Fig. 1 (general) as common ones for IFNs-α, β, ω.
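
The consensus step described for Fig. 1 (keeping only the sites predicted by all the methods used) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each method's output is represented as a string of per-residue states ('H' for helix, 'E' for sheet, '-' for other), and the function name `consensus_segments` and the toy predictions are hypothetical rather than taken from the paper.

```python
def consensus_segments(predictions, state="H"):
    """Return (start, end) 1-based segments where ALL predictions agree on `state`.

    `predictions` is a list of equal-length strings, one per prediction method.
    NOTE: the string encoding is an illustrative assumption.
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    segments = []
    start = None
    for i in range(length):
        agreed = all(p[i] == state for p in predictions)
        if agreed and start is None:
            start = i                      # a consensus segment opens here
        elif not agreed and start is not None:
            segments.append((start + 1, i))  # segment closed at previous residue
            start = None
    if start is not None:
        segments.append((start + 1, length))
    return segments


# Three hypothetical per-residue predictions for a 10-residue chain.
toy_predictions = ["HHHH--HHH-", "-HHH--HHHH", "HHH---HH--"]
helix_consensus = consensus_segments(toy_predictions, state="H")
sheet_consensus = consensus_segments(toy_predictions, state="E")
```

Requiring unanimous agreement is deliberately strict: it shortens segments relative to any single method, which matches the conservative reading of "predicted by all the methods used".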

Recently the three-dimensional structure of IL-2 has been determined [26]. A schematic model of this structure and the supposed interaction with receptors are shown in Fig. 2B. It was supposed earlier that IL-2 and IFNs must have a three-dimensional structure that is common in general features [6]. Therefore, we attempted to fold the predicted α-helical segments of IFNs into the three-dimensional structure established experimentally for IL-2 [26]. The result is shown in Fig. 2A. It should be noted that in the IL-2-like model of IFNs (Fig. 2A) cysteine residues are found at distances not hindering the formation of disulfide bridges between them. In general, the direction of the polypeptide chain in the model presented in Fig. 2A coincides with the predicted structure shown in Fig. 6 and Fig. 7A of Ref. 13. In the model of IL-2 presented in Fig. 2B, the regions important for biological activity (according to the experimental data [25,44]) are shaded.

It is known that there exists a correlation between the conservativity of amino acids at definite positions and their affiliation to structurally and/or functionally significant fragments of the polypeptide chain.



Our analysis of conservative positions in 72 IFNs reveals 17 positions at which there are no substitutions or only one exists (the alignment of 72 IFNs was provided by Dr. A. Ya. Strongin, Institute of Molecular Genetics, Moscow, personal communication). These positions are indicated in Fig. 2A by circles (shaded circles denote hydrophobic amino acid residues, unshaded the hydrophilic ones). Figures near one-letter symbols correspond to their positions in the amino acid sequence of the mature protein. As seen from Fig. 2A, all the conservative residues can be divided into three clusters: the 'loop' one: L, L, R, P, V [...]; [...] L96, [...], Y[...], L[...], C13[...], A[...], W141 and V144.
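
The selection of conservative positions can be illustrated with a short Python sketch. It assumes one possible reading of the criterion, namely that a column of the alignment is conservative when at most one sequence deviates from the most frequent residue; the function name `conservative_positions`, the `max_substitutions` parameter and the toy alignment are hypothetical and do not reproduce the alignment of 72 IFNs.

```python
from collections import Counter


def conservative_positions(aligned, max_substitutions=1):
    """Return [(position, residue)] for columns with few deviations.

    `aligned` is a list of equal-length, pre-aligned sequences ('-' = gap).
    A column is kept if the number of sequences that differ from the most
    frequent residue is <= max_substitutions (an assumed reading of
    "no substitutions or only one").
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")

    result = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned)
        residue, top = counts.most_common(1)[0]
        deviations = len(aligned) - top
        if residue != "-" and deviations <= max_substitutions:
            result.append((col + 1, residue))  # 1-based position
    return result


# Toy alignment of four hypothetical 4-residue fragments.
toy_alignment = ["ACLW", "ACIW", "GCVW", "ACMW"]
relaxed = conservative_positions(toy_alignment, max_substitutions=1)
strict = conservative_positions(toy_alignment, max_substitutions=0)
```

Setting `max_substitutions=0` keeps only invariant columns, while the default of 1 also admits columns with a single deviating sequence.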

Conservative residues of the 'hydrophobic' cluster are located on the converged segments of helices D and E and seem to be necessary for stabilization of the conformation of the loop between the helices, which is similar for all IFNs. Attention to the role of the given loop in IFN function increases further due to the homology revealed between the corresponding sequence of IFN-α2 and the thymus hormone PTM-α1 [4]. Fig. 3 shows the alignment of the C-terminal sequence of IFN-α2, starting from residue 116, and of PTM-α1. Probability of random coincidence for the sequences compared is P = 5·10^-4.

Probability of random coincidence was determined 




Fig. 1. [...] prediction of domain borders by the method of Ref. [23]. Cylinders [...]; [...] lines denote secondary structures, the assignment of which remains ambiguous [...].
Fig. 2. (A) Schematic presentation of the putative three-dimensional IFN structure obtained due to folding of the predicted α-helical segments of IFNs-α, β, ω into the three-dimensional structure established experimentally for IL-2 [26] (B). The α-helical segments, in accord with their sequence from the N- to C-termini of the polypeptide chain, are designated by cylinders and large letters in alphabetic order. Conservative residues in IFNs-α, β, ω are indicated by circles with symbols and numbers: open circles denote hydrophilic residues, shaded circles hydrophobic residues. The letters SS mark the positions of disulfide bonds. The supposed active sites in IFNs and IL-2 are shaded. The curved line indicates the outline of the putative binding sites of receptors.



according to the formula P = W·Z·X [22]. The W value (probability of obtaining 'by chance' n random coincidences between two proteins in the region H amino acids long) was estimated from the binomial distribution of random values:

$$W = \sum_{i=n}^{H} \frac{H!}{i!\,(H-i)!} \left(\frac{1}{20}\right)^{i} \left(\frac{19}{20}\right)^{H-i},$$

where n is the number of coincidences. In calculations



the number of possible independent comparisons between primary structure fragments including deletions or insertions must be taken into account. Z denotes the number of deletion distributions:

$$Z = \frac{(H-L-1)!}{K!\,(H-L-K-1)!},$$



where K is the number of deletions and L is the total length of deletions; X is the number of combinations of deletion lengths:

$$X = \frac{(L-1)!}{(K-1)!\,(L-K)!}.$$

Hu IFN-α2
PTM-α1

120 130 140 150 160

* * * ****** ******* ***

SILAVRKYFQRITLYLKEKKYSPCAWEVVR-AE IMR-At'IMRSF SLSTNL-QE

SDAAVDTSSEITTKDLKEKK EV VE EAENGR D AP ANGN AQNE ENG EO 6

10 20 30 40

Fig. 3. The alignment of primary structures of IFN-α2 [29] and PTM-α1 [52].
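
The estimate P = W·Z·X can be illustrated numerically with a short Python sketch. It assumes that W is the binomial tail probability of at least n identities among H aligned positions, with a chance-match probability of 1/20 per position; the function names are hypothetical, and Z and X are taken as plain multiplicative inputs rather than being computed, so that no particular form of those two factors is implied.

```python
from math import comb


def chance_match_probability(n, H, p=1 / 20):
    """Binomial tail: probability of at least n identities in H positions.

    p = 1/20 assumes 20 equiprobable amino acids (illustrative assumption).
    """
    if not 0 <= n <= H:
        raise ValueError("require 0 <= n <= H")
    return sum(comb(H, i) * p**i * (1 - p) ** (H - i) for i in range(n, H + 1))


def random_coincidence_probability(n, H, Z=1, X=1):
    """P = W * Z * X, with Z and X supplied by the caller (default: no deletions)."""
    return chance_match_probability(n, H) * Z * X


# Example: 55 identities over an alignment 164 residues long, no deletions.
w_example = chance_match_probability(55, 164)
p_example = random_coincidence_probability(55, 164)
```

With Z = X = 1 the result is simply W; alignments that contain deletions multiply W by the number of alternative ways those deletions could have been placed, which raises P accordingly.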



The fragment LKEKK, identical for IFN-α2 and PTM-α1, corresponds to the loop between helices D and E in IFNs. Since PTM-α1 is an immunomodulatory hormone, it is reasonable to assume that the loop formed by helices D and E is a part of the site responsible for IFN immunomodulatory activity.

Assuming that the conservative amino acid residues of the 'loop' and 'hydrophilic' clusters as well as the loop between helices D and E directly participate in the formation of active sites of IFNs-α, β, ω, we indicated the orientation of the molecule relative to the tentative receptor. It is seen that the [...] of the interaction of IFNs-α, β with the receptor is similar to the analogous interaction for IL-2 (Fig. 2B).





[Fig. 4: panels labelled IL-5, IL-7, IL-6, G-CSF, GM-CSF] [...] segments coded by separate exons.
Fig. 5. The putative evolutionary relationships among proteins of the superfamily of α-helical globular (interferon-like) proteins. Double lines indicate the statistically valid homology of primary structures, solid lines the statistically valid homology of the hydrophobic cores [6,7], broken lines indicate functional relationship.



Fig. 4 shows the results of secondary structure predictions for the proteins listed in Table I. The computation was carried out by the method of Efimov [17], which determines the localization of the segments of α-helices capable of building into the hydrophobic core. The proteins are seen to have a definite extent of similarity in the secondary structure. In most cases, in a separate mature protein, five α-helical segments are identified. Besides, Fig. 4 presents the accumulated data on the intron localization along the amino acid sequence. There exists a certain correlation between the location of α-helices and that of introns. The first and second exons contain, as a rule, one helix, the third exon two helices and the fourth [...] α-helical segments. The exception is IL-3, whose additional intron is located between the two last α-helices.



In addition to analogies in secondary structure and intron-exon organization of the gene, in a number of cases the homology between the proteins under analysis is observed directly in primary structures. Thus, the homology is shown between IFN-α, β and IFN-γ [46], IFN-β and IFN-β2 (IL-6) [45], IFN-β2 and G-CSF [9], cMGF and G-CSF and IL-6 [10], IL-3 and IL-5 [11]. Table II presents estimations of probability of random coincidences (P) of these homologies. It can be considered that only in the case of IL-3 and IL-5 the observed homology [5] is not statistically valid. Fig. 5 shows possible evolutionary relationships between these proteins. Double lines indicate the statistically valid homology in the fragments of primary structures compared. Single lines denote the statistically valid homology in the structure of the hydrophobic core [6,7].

Discussion 

In the present work an attempt is made to analyse possible structure-evolutionary relationships between immunocytokines. The analysis of their possible secondary structures demonstrated that most of them are α-helical proteins sharing some common features of distribution of α-helices along the polypeptide chain (IL-1 and TNF were not analysed in this paper, since the X-ray analysis revealed the β-structure type of their folding [47,49]). In particular, the relationship between the location of α-helices and that of introns was demonstrated. [...] these [...]



similarity in the three-dimensional structures of the 
proteins under study. Nevertheless, for some proteins 
TABLE II

Probability of random coincidence of amino acid residues in immunocytokines

No.  Protein   Numbers of the N-   Alignment   Quantity of   Overall    Quantity of   Probability    Alignment
     name      and C-termini of    compared,   deletions,    deletion   coinciding    of random      origin
               the fragments       H           K             length,    amino acids,  coincidence,
               compared                                      L          n             P

1    IFN-αA    1-163               164         ...           ...        55            7·10^-2...     [29]
     IFN-β     3-166
2    IFN-β     67-112              46          ...           ...        14            3·10^-...      [46]
     IFN-γ     60-104
3    IFN-β     33-67               37          ...           ...        11            6·10^-...      [45]
     IFN-β2    42-77
4    IFN-β     136-165             32          ...           ...        9             8·10^-2        [45]
     IFN-β2    122-151
5    IFN-β2    ...-70              74          ...           ...        18            3·10^-3        [9]
     G-CSF     22-92
6    G-CSF     1-177               182         ...           ...        72            1·10^-...      [10]
     cMGF      1-178
7    IL-6      2-184               183         ...           ...        40            4·10^-7        [10]
     cMGF      ...-175
8    IL-5      ...-74              37          ...           ...        11            ...            ...
     IL-3      ...-33
(IFN-α, β, γ, IL-6, G-CSF and cMGF) such a conclusion is natural. So, they are shown to have a statistically valid homology in a fragment of the amino acid sequence. Besides, for a number of proteins (IFN-α, β, IL-2, IL-3) the analogies in the structure of hydrophobic cores were revealed [6,7], which provides evidence for common features of formation of their three-dimensional structures. It has been shown for IL-2 and IFN-α, β that analogies are observed both in the tertiary structure and in the arrangement of probable active sites.

As a result of the analysis of structural interrelationships between proteins of the given family, some conclusions on their origin can be drawn. So, the fact that the C-terminal part of IFN-αA has a statistically valid homology with PTM-α1 yields an assumption that the IFN-α genes are products of fusion of a PTM-α1 gene with a gene of an IFN precursor. The effect of gene fusion can be manifested in the protein domain arrangement. Two-domain organization of IFNs-α, β is evidenced by the data on domain localization obtained by means of the method of Vonderviszt and Simon [23] on the basis of the protein amino acid sequences. The results of difference adiabatic scanning microcalorimetry for the recombinant IFN-α2 indicate the existence of two unequal domains in the molecule of this protein as well [48]. At the same time the fluorescence polarization data indicate the lack of intramolecular mobility in the molecule of IFN-α2 [48]. On the basis of the data on homology of primary structure fragments of IFNs-α, β and IFN-γ, it was supposed in Refs. 3 and 4 that the IFN-γ gene is a product of recombination of segments of IFN-α and -β genes.

Thus, one more protein family can be identified in the immune system. Similarly to the immunoglobulin superfamily [50], the given family seems to include, along with the soluble globular proteins analysed in the present work, protein receptors, adhesion factors of lymphocytes (LFA-1) and macrophages (Mac-1), whose primary structures are homologous to those of IFNs-α, β [51]. In contrast to the immunoglobulin superfamily involving β-pleated proteins, the family of interferon-like proteins is α-helical. Nevertheless, a close structure-function relationship seems to exist between these two families. For instance, the M-CSF receptor belongs to the family of immunoglobulin-like proteins [50].